Monitoring Deforestation In Landsat Data Cubes Using NDVI
The deforestation of the rainforest is a important topic in science and public. Especially due to the worsening of climate change, there are more and more calls for countries that contain large parts of the rainforest to stop deforestation. This is especially true for Brazil, which has the largest amount of tropical rainforest in the world, covering an area of 318,7 ha (statista). For a long time it seemed that Brazil could counteract deforestation. At the beginning of the 2010s, Brazil managed to reduce deforestation of the rainforest by 84%, compared to the historic peak of 2004. Thus, in 2012, only 4,571 km2 were cut down instead of 27,772 km2 in 2004 (nature ecology & evolution: The Brazilian Amazon deforestation rate in 2020 is the greatest of the decade). However, this trend has reversed since 2013. The Brazilian Amazon Deforestation Monitoring Program (PRODES) estimated deforestation of 11,088 km2 for 2020, which corresponds to the highest value for the entire decade. This upward trend in deforestation was triggered by changes in the Brazilian Forest Code and new laws which may allow illegally grabbed public land to be legalized (Clarifying Amazonia’s burning crisis). Thus, illegal clearing of the rainforest, which is mostly difficult to record, occurs over and over again.
With this work we present a tool to capture deforestation of the Brazilian rainforest in near real time by analyzing the NDVI in Landsat Data Cubes. Using previously collected reference data for a study area in the Brazilian Amazon, we calculate the changes in NDVI after deforestation in order to use these values for future detection of deforestation. By comparing the NDVI change of satellite images from two timesteps, we analyze deforestation. A land is identified as deforested if the NDVI decrease is over a determined threshold. To make the satellite images usable, clouds were removed and thus missing pixel values were replaced by the observated values of future timestamps. In addition seasonal variations where removed, which would otherwise have disguised erroneously deforestation in the data. The methodology was largely taken from the paper Monitoring Deforestation at Sub-Annual Scales as Extreme Events in Landsat Data Cubes, leaving out some components that were not interesting for us and adding others that we needed, e.g. for the near real time analysis. In the following, we specify our used data, briefly describe the used methods as well as results and discuss them afterwards.
To perform our calculations, we choose a study area located in the northwest of Brazil. This area is located along the border of the Brazilian states Randônia, Amazonas and bordering the Bolivian state of Pando (compare Figure TODO). We choose this study area because it is completely covered by the area of the native rainforest and is also affected by deforestation. To carry out our methods, two types of data were needed. On the one hand satellite data to calculate the NDVI and on the other hand reference data when and where specifc areas were recognized as deforested. We need both data to inspect the NDVI change of an deforested area. The satellite data we used are Landsat 8 data in GeoTIFF format in the CRS WGS 84 / UTM zone 19N, with a spatial extent of [left = -7347259, right = -7314864, top = -995476.1, bottom = -1025490] and a temporal extent 2014-01-01 until 2019-12-31. The raw data is available (here). In addition to the data needed to calculate the NDVI, data was also needed where deforestation had already been detected. We obtained these data from the PRODES program, already mentioned in the introduction. They publish data for the deforested areas in Brazil larger than 6.25 hectares starting from 2008. Wherby they considers deforestation to be the suppression of native vegetation, regardless of the future use of these areas. We have downloaded the corresponding data as shapefiles (here).
As already briefly mentioned in the introduction, the basic idea of our approach comes from the paper Monitoring Deforestation at Sub-Annual Scales as Extreme Events in Landsat Data Cubes. Their basic idea was to divide an NDVI image time series into a reference cube and a monitoring cube. The reference cube contains historical observations where non-forest pixels have been masked, and the monitoring cube contains newly acquired observations, not yet assessed for deforestation. Based on training data and observations in the reference cube a threshold percentile for defining an observation as abormally low was calculated and applied for each pixel value in the monitoring cube. If the pixel is below the threshold percentile, the pixel is flagged as deforested or potential deforested. We adopted this basic idea and also carried out usual preparatory steps, such as a filling of cloud gaps and a deseasonalization of the data.
First, both the reference data and satellite images must be converted into a format that we can use. To start with the reference data. The data was loaded from PRODES into QGIS to prepare it accordingly. We changed the coordinate reference system from SIRGAS 2000 to EPSG:3857 - WGS 84/ Pseudo-Mercator for these reference data as our satellite data were already in this format. Since the data were always available for individual days, we have combined them into months. As the PRODES data are shapefiles, they were converted into the raster format which is also used for our satellite data. The last step was to crop the data to the study area described above. Similar steps that were performed for the PRODES data in QGIS, were performed for the satellite data in R using gdalcubes.
We split the images in a reference time series with a temporal extent from 2014 to 2018 and a monitoring time series with a temporal extent for 2019. Therefore we create gdal cubes. With the help of a mask, only those pixels are considered that are either clear or water in order to avoid calculation difficulties caused by clouds. For the reference time series, two data cubes are created, one with a spatial resolution of 250m for the calculation of the NDVI and one with a spatial resolution of 1000m for the deseasonalization of the data. This lower resolution in general but especially for the deseasonalization was necessary due to lack of computer power. With the creation of the data cubes, the spatial extent is set to the study area, the temporal extent was limited to the above mentioned dates, whereby the size of of pixels in time-direction was about one month.
The NDVI is then calculated for both data cubes. Using a mask to work only with cloud-free pixels inevitably results in pixels with na values that are not of the two selected classes for the mask.